Skip to content

Conversation

@djeong20
Copy link
Contributor

@djeong20 djeong20 commented Dec 4, 2025

This pull request refactors the unit tests for OpenCL kernels.
The changes involve splitting the test files into three categories based on functionality.
One for INT4-related tests, another for QK_K (GGML format) tests, and a third for the remaining BLAS tests.
Additionally, it consolidates common helper functions used across the test files into separate utility header files.

Self-evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

This pull request refactors the unit tests for OpenCL kernels.
The changes involve splitting the test files into three categories based on functionality.
One for INT4-related tests, another for QK_K (GGML format) tests, and a third for the remaining BLAS tests.
Additionally, it consolidates common helper functions used across the test files into separate utility header files.

**Self-evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test:   [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghyeon Jeong <[email protected]>
nntrainer::IniWrapper ini;
};

#define GEN_TEST_INPUT_NHWC(input, eqation_i_j_k_l) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about keeping the whitespace at the end of each line? And remove file diffs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the previous version has incorrect formatting, likely due to being formatted with an outdated version.
(The last time this file was edited was a year ago!)

Copy link
Contributor

@songgot songgot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

// Initialize Activation
std::vector<float> activation = generate_random_vector<float, false>(M * K);
float *activations_f32_ptr = (float *)allocateSVM(M * K * sizeof(float));
blas_cc->command_queue_inst_.enqueueSVMMap(activations_f32_ptr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After use, Right?
blas_cc->command_queue_inst_.enqueueSVMUnmap(activations_f32_ptr, 0, nullptr, nullptr);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. However, I believe that unmap should be called before kernel execution and should be handled here.

// weight 0 (3072 x 3072)
void *ws0 = allocateSVM(data_size_n0 / scale_group_size);
void *wq0 = allocateSVM(data_size_n0 / 2);
blas_cc->command_queue_inst_.enqueueSVMMap(wq0, data_size_n0 / 2, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't 'true' better than 'false'?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants